Meeting 6 - April 27th, 2021

  

Meeting Date: 04/27/2021

Meeting Time: 10:00am-12:00pm PDT

Meeting Location: Virtual Conference via Zoom

Approval Date: 5/25/2021

Recorded by: UCSF Team

MEETING MINUTES

Project Overview:

The Centers for Medicare & Medicaid Services (CMS) has granted an award to the University of California San Francisco (UCSF) to develop a measure of computed tomography (CT) image quality and radiation safety. The project is a part of CMS’s Medicare Access & CHIP Reauthorization Act (MACRA)/Measure Development for the Quality Payment Program. The project title is “DR CTQS: Defining and Rewarding Computed Tomography Quality and Safety”. The Cooperative Agreement number is 1V1CMS331638-02-00. As part of its measure development process, UCSF convened groups of stakeholders and experts who contributed direction and thoughtful input to the measure developer during measure development and maintenance.

Project Objectives:

The goal of the project is to create a quality measure for CT to ensure image quality standards are preserved and the harmful effects of the radiation used to perform the tests are minimized. Radiation doses delivered by CT are far higher than those from conventional radiographs (x-rays), the doses are in the range known to be carcinogenic, and there is a significant performance gap across health care organizations and clinicians, which has consequences for patients. The goal of the measure is to provide a framework in which health care organizations and clinicians can assess their doses, compare them to benchmarks, and take corrective action to lower them while preserving the quality of images so that they are useful to support clinical practice. The measure will be electronically specified using procedural and diagnostic codes in billing data as well as image and electronic data stored with CT scans, typically within the Picture Archiving and Communication System (PACS) – the computerized system for reviewing and storing imaging data – or the Radiology Information System (RIS).

TEP Objectives:

In its role as a measure developer, the University of California San Francisco is obtaining input from a broad group of stakeholders to develop a set of recommendations to develop a radiology quality and safety measure. The proposed measure will be developed with the close collaboration of the leadership from diverse medical societies as well as payers, health care organizations, experts in safety and accreditation, and patient advocates. A well-balanced representation of stakeholders on the TEP is intended to ensure the consideration of key perspectives and obtain balanced input.

Scope of Responsibilities:

The TEP’s role is to provide input and advice to the measure developer (University of California San Francisco) related to a series of planned steps throughout the 3-year project. The specific steps will include developing and testing a risk-adjusted measure which can be used to monitor CT image quality in the context of minimizing radiation doses while maintaining acceptable image quality. The TEP will assist UCSF in conceptualizing the measure and any appropriate risk adjustment of it. The TEP will assist UCSF with identifying barriers to implementing the proposed measure and test sites in which the developer can assess the feasibility and performance of its use. The TEP will assist UCSF with interpreting results obtained from the test sites and in suggesting modifications of the measure. The TEP will provide input and advice to UCSF to ensure that the measure is valuable for a wide range of stakeholders and CMS.

Guiding Principles:

Participation on the TEP is voluntary. Individuals participating on the TEP understand that their input will be recorded in the meeting minutes. Proceedings of the TEP will be summarized in a report that may be disclosed to the general public. If a participant has disclosed private, personal data by his or her own choice, then that material and those communications are not deemed to be covered by patient-provider confidentiality. Questions about confidentiality will be answered by the TEP organizers.

All TEP members must disclose any significant financial interest or other relationships that may influence their perceptions or judgment. It is unethical to conceal (or fail to disclose) conflicts of interest. However, the disclosure requirement is not intended to prevent individuals with particular perspectives or strong points of view from serving on the TEP. The intent of full disclosure is to inform the TEP organizers, other TEP members and CMS about the source of TEP members’ perspectives and how that might affect discussions or recommendations.

All TEP members should be able to commit to the anticipated time frame needed to perform the functions of the TEP.

Estimated Number and Frequency of Meetings:

The TEP is expected to meet three times per year, either in person or via webinar.
This meeting was originally set to occur in person but was changed to a virtual meeting, as mandated by federal social distancing measures and statewide shelter-in-place orders.

Table 1. TEP Member Name, Title, and Affiliation

Name | Title | Organization

Attendees

Mythreyi Bhargavan Chatfield, PhD | Executive Vice President | American College of Radiology
Niall Brennan, MPP | CEO | Health Care Cost Institute
Helen Burstin, MD, MPH, FACP | Executive Vice President | Council of Medical Specialty Societies
Melissa “Missy” Danforth | Vice President of Health Care Ratings | The Leapfrog Group
Tricia Elliot, MBA, CPHQ | Director, Quality Measurement | Joint Commission
Jeph Herrin, PhD | Adjunct Assistant Professor | Yale University
Jay Leonard “Len” Lichtenfeld, MD, MACP | Independent Consultant | Formerly Deputy Chief Medical Officer, American Cancer Society, Inc.
Leelakrishna “Krishna” Nallamshetty, MD | Associate Chief Medical Officer | Radiology Partners
Matthew Nielsen, MD, MS | Professor and Chair of Urology | UNC Gillings School of Global Public Health
Debra Ritzwoller, PhD | Patient Advocate and Health Economist | Patient Representative
Lewis “Lew” Sandy, MD | Executive Vice President, Clinical Advancement | UnitedHealth Group
Mary Suzanne “Suz” Schrandt, JD | Patient Advocate | Patient Representative
James Anthony “Tony” Seibert, PhD | Professor | University of California, Davis
Arjun Venkatesh, MD, MBA, MHS | Associate Professor, Emergency Medicine | Yale School of Medicine
Kenneth “Ken” Wang, MD, PhD | Adjunct Assistant Professor, Radiology | University of Maryland, Baltimore

Not in Attendance

Hedvig Hricak, MD, PhD | Radiology Chair | Memorial Sloan Kettering Cancer Center
Todd Villines, MD, FSCCT | Professor and Director of Cardiovascular Research and Cardiac CT Programs | University of Virginia

Ex Officio TEP

Amy Berrington de Gonzalez, DPhil | Branch Chief & Senior Investigator | National Cancer Institute; Division of Cancer Epidemiology & Genetics, Radiation Epidemiology Branch
Mary White, ScD | Chief, Epidemiology and Applied Research Branch | Centers for Disease Control and Prevention

CMS & MACRA/CATA Representatives

Janis Grady | Project Officer | Centers for Medicare & Medicaid Services

UC Team

Rebecca Smith-Bindman, MD | Principal Investigator | University of California, San Francisco
Patrick Romano, MD, MPH | Co-Investigator | University of California, Davis
Carly Stewart | Lead Project Manager | University of California, San Francisco
Sophronia Yu | Data Analyst | University of California, San Francisco
Susanna McIntyre | Research Assistant | University of California, San Francisco
Andrew Bindman, MD | Advisor | Kaiser Permanente, former Co-Investigator with the University of California, San Francisco


Technical Expert Panel Meeting
Prior to the meeting, TEP members received a copy of the agenda, the presentation slides, a link to the DR-CTQS study website (which contains minutes from prior TEP meetings), honorarium documentation, and a conflict of interest form. The meeting was conducted using PowerPoint slides via Zoom video conference.

 

10:00 AM    Call meeting to order by TEP Chair (Dr. Sandy)

10:05 AM    Roll call and updated conflicts (Dr. Sandy)

TEP members’ attendance is listed above.

A conflict of interest is defined as any of the following, applying to you, your spouse, your registered domestic partner, and/or your dependent children:

1. Received income or payment as an employee, consultant or in some other role for services or activities related to diagnostic imaging
2. Currently own, or have held in the past 12 months, an equity interest in any health care related company which includes diagnostic imaging as a part of its business
3. Hold a patent, copyright, license or other intellectual property interest related to diagnostic imaging
4. Hold a management or leadership position (e.g., Board of Directors, Scientific Advisory Board, officer, partner, trustee) in an entity with an interest in diagnostic imaging
5. Received any cash or non-cash gifts from organizations or entities with an interest in diagnostic imaging
6. Received any loans from organizations or entities with an interest in diagnostic imaging
7. Received any paid or reimbursed travel from organizations or entities with an interest in diagnostic imaging

COIs were disclosed to UCSF prior to this TEP meeting via paperwork. No members had financial conflicts that precluded their participation. TEP members were also asked to verbally disclose any COIs when introducing themselves for the purpose of group transparency. TEP members re-stated their affiliations and any existing conflicts. 


•    Dr. Lewis Sandy stated his affiliation with UnitedHealth Group and had no new conflicts to report.
•    Niall Brennan stated he is CEO of Health Care Cost Institute and had no new or existing conflicts.
•    Dr. Krishna Nallamshetty works at Radiology Partners where he serves as Associate Chief Medical Officer and chair of the patient safety committee. He is associate faculty at the University of South Florida in radiology and cardiology. He had no new conflicts to report.
•    Missy Danforth stated she is Vice President for healthcare ratings at the Leapfrog Group and had no new or existing conflicts.
•    Tricia Elliot stated her role as Director of Quality Measurement at The Joint Commission and had no new or existing conflicts.
•    Dr. Jeph Herrin stated his affiliation with Yale University and no new or existing conflicts. 
•    Dr. Leonard Lichtenfeld stated he is an independent consultant, formerly Deputy Chief Medical Officer at the American Cancer Society. He had no new conflicts to report.
•    Dr. Matthew Nielsen reported he is Chief of Urology at University of North Carolina, Chapel Hill. He had no new conflicts. 
•    Suzanne Schrandt introduced herself as the founder and CEO of ExPPect and a senior patient engagement advisor to the Society to Improve Diagnosis in Medicine; she had no new conflicts.
•    Dr. Anthony Seibert stated his role as a medical physicist at UC Davis Health and had no conflicts to declare. 
•    Dr. Arjun Venkatesh stated he is an emergency physician on faculty at Yale University and had no new conflicts to report.
•    Dr. Kenneth Wang stated he works both for the VA Hospital in Baltimore and at the University of Maryland, Baltimore. Though he works for the federal government, he participates on this TEP in his personal capacity and does not represent the government. He had no new conflicts.
•    Dr. Amy Berrington introduced herself as Branch Chief of Radiation Epidemiology at NCI and had no conflicts. 
•    Dr. Mary White stated she is Chief of the Epidemiology and Applied Research Branch in the Cancer Division at the Centers for Disease Control & Prevention and had no new conflicts.

TEP members Drs. Mythreyi Chatfield, Helen Burstin, and Debra Ritzwoller joined the call after roll call.

  • Dr. Mythreyi Chatfield stated she is Executive Vice President for Quality and Safety at the American College of Radiology and had no new or existing conflicts.
  • Dr. Helen Burstin stated her affiliation as the CEO of the Council of Medical Specialty Societies and had no new or existing conflicts of interest.
  • Dr. Debra Ritzwoller stated she is with Kaiser Permanente Colorado and is serving in the capacity of a patient advocate. She had no new conflicts to disclose. 


10:10 AM     TEP goals (Dr. Smith-Bindman)
 
The goals of this TEP meeting include: 

1.    CMS and measure updates
2.    Beta testing overview and results, including:

a.    Missing data
b.    Exclusions
c.    CT category
d.    Size assessment
e.    Risk-adjusted upper radiation dose thresholds
f.    Image quality minimal floor thresholds

3.    Physician burden assessment
4.    TEP vote on the face validity of the measure

10:15 AM     CMS & measure updates (Dr. Smith-Bindman)

Corresponding hospital measure 

Dr. Smith-Bindman noted that the TEP had previously shown support for developing a corresponding measure in the hospital reporting programs, to facilitate physicians’ access to radiology data owned by hospitals and to align incentives between physicians and hospitals.

At the invitation of CMS, Dr. Smith-Bindman and Alara Imaging, Inc. applied for support to develop the hospital measure in January 2021. However, CMS did not move forward with the funding opportunity, citing the new administration’s singular focus on COVID. Despite this decision, UCSF, in partnership with Alara, is moving forward with the hospital measure and will submit it to the Measures Under Consideration (MUC) List this May.

eCQM development 

In response to new direction from CMS, UCSF transitioned from developing the measure as a Qualified Clinical Data Registry (QCDR) to an electronic clinical quality measure (eCQM), aligning with CMS’s goal of moving towards all digital measures.

In the eCQM, all data elements are captured electronically from the EHR, the Radiology Information System (RIS), and the Picture Archiving and Communication System (PACS). UCSF has authored the eCQM specifications and published the required value set for LOINC radiology data elements. In parallel with beta testing, UCSF is testing the eCQM using data from testing sites, but today’s presentation focuses on testing results concerning the measure logic, not this eCQM testing.

UCSF will submit the MIPS measure to the MUC List in May, with final testing data submitted by the end of July, and to the NQF in August. UCSF will host one final TEP meeting prior to NQF submission. [5/6/21 update: CMS has rescinded the extension to submit testing data; thus, the complete measure will be submitted in May.]

10:20 AM     Overview, exclusions, missing data & CT category (Dr. Smith-Bindman)

Calculation sequence

Dr. Smith-Bindman gave an overview of the measure calculation sequence (a schematic code sketch follows the list):


1.    Identify and exclude ineligible CT exams (e.g. exams done in combination with biopsies or nuclear medicine) 
2.    Assign CT exam to a CT category based on indication (using ICD10 and CPT codes)
3.    Calculate patient’s size (using DICOM data)
4.    Calculate size-adjusted radiation dose (using DICOM data)
5.    Calculate global noise, a measure of image quality (using DICOM data)
6.    Assess if dose or noise exceeds category-specific thresholds
7.    Calculate the proportion of out-of-range values at the level of the physician group (TIN)
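
For concreteness, this sequence can be read as a short computation over per-exam data. The sketch below is illustrative only: the field names, threshold structure, and pre-computed values are assumptions, not the UCSF software.

```python
from dataclasses import dataclass
from typing import Dict, List

# Illustrative sketch of the calculation sequence above; field names,
# thresholds, and structure are assumptions, not the UCSF implementation.

@dataclass
class CTExam:
    tin: str          # physician group (Tax Identification Number)
    eligible: bool    # step 1: False for e.g. biopsy/nuclear medicine exams
    category: str     # step 2: CT category assigned from ICD10/CPT codes
    size_cm: float    # step 3: patient size calculated from DICOM data
    adj_dose: float   # step 4: size-adjusted radiation dose
    noise: float      # step 5: global noise (image quality)

@dataclass
class Threshold:
    max_dose: float   # category-specific upper radiation dose threshold
    max_noise: float  # category-specific maximum acceptable noise

def out_of_range(exam: CTExam, thresholds: Dict[str, Threshold]) -> bool:
    # Step 6: flag exams whose dose or noise exceeds the category threshold.
    t = thresholds[exam.category]
    return exam.adj_dose > t.max_dose or exam.noise > t.max_noise

def tin_scores(exams: List[CTExam],
               thresholds: Dict[str, Threshold]) -> Dict[str, float]:
    # Step 7: proportion of out-of-range exams per physician group (TIN).
    by_tin: Dict[str, List[CTExam]] = {}
    for e in exams:
        if e.eligible:                        # step 1: drop ineligible exams
            by_tin.setdefault(e.tin, []).append(e)
    return {tin: sum(out_of_range(e, thresholds) for e in grp) / len(grp)
            for tin, grp in by_tin.items()}
```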

Data Collection
Testing sites downloaded UCSF software onto local servers or virtual machines and sent data from consecutive CT exams to the software, which abstracted the data and exported it to UCSF. The following data elements were collected from each source (a hypothetical merging sketch follows the list):

From radiology electronic systems (i.e. PACS and/or RIS):
- Radiation dose structured report (RDSR)
- Image pixel data
- Variables on why and how the CT was performed
- Linkage variables to allow data sources to be merged by patient

From the EHR:
- ICD10 codes
- Linkage variables

From billing systems, Charge Master, EHR, or RIS:
- CPT codes
- Linkage variables
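
As a rough illustration of how the linkage variables tie these feeds together, the sketch below merges the three sources on a shared key. All names and structures are hypothetical; the actual UCSF schema was not presented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical sketch of merging the collected elements per exam using the
# linkage variables; names are illustrative, not the UCSF schema.

@dataclass
class MergedExamRecord:
    linkage_key: str                     # shared patient/exam identifier
    rdsr: Optional[dict] = None          # radiation dose structured report
    pixel_data: Optional[bytes] = None   # image pixel data from PACS/RIS
    exam_context: Dict[str, str] = field(default_factory=dict)  # why/how CT performed
    icd10_codes: List[str] = field(default_factory=list)        # from the EHR
    cpt_codes: List[str] = field(default_factory=list)          # from billing systems

def merge_sources(radiology: Dict[str, dict],
                  ehr: Dict[str, List[str]],
                  billing: Dict[str, List[str]]) -> Dict[str, MergedExamRecord]:
    """Join the three source feeds on their shared linkage variables."""
    records = {}
    for key, rad in radiology.items():
        records[key] = MergedExamRecord(
            linkage_key=key,
            rdsr=rad.get("rdsr"),
            pixel_data=rad.get("pixel_data"),
            exam_context=rad.get("exam_context", {}),
            icd10_codes=ehr.get(key, []),
            cpt_codes=billing.get(key, []),
        )
    return records
```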

The measure is currently undergoing testing at eight sites. In accordance with the requirement that eCQMs be tested with data from multiple EHR systems, UCSF recently added two additional testing sites to represent diverse EHRs. These newest sites are still sending data and are not represented in today’s presentation.

Sample size and missing data

UCSF has thus far assembled data from six healthcare systems, including 13 physician groups (TINs) and 21,689 CT exams (slide 15). The sites include both inpatient and outpatient care settings. Overall, 5% of exams were ineligible due to exclusion criteria (e.g. non-diagnostic exams, exams in children). 6% were eligible but excluded due to missing data, with the primary contributor being missing RDSRs (slide 16). 

Radiation dose structured report (RDSR)

The RDSR is a digitized summary of radiation dose information. While there are other ways to abstract dose from exams, the RDSR is the most accurate, consistent, and standardized. This issue was raised at the previous TEP meeting, when UCSF discovered all testing sites were missing RDSRs. Though CMS imposes a 15% payment penalty on facilities and physicians not complying with the National Electrical Manufacturers Association (NEMA) standard XR-29, which requires that CT machines generate the RDSR, there is no requirement to save the RDSR. Through discussions with sites, UCSF learned the process for saving RDSRs varies by vendor – simple for Siemens and Philips, labor-intensive for GE. With support from Siemens, one testing site went from saving RDSRs on 0% to 96% of its machines within a week. Another site with mostly GE machines increased saving from 10% to 65% within a month, changing one machine at a time. Nationwide, based on ACR registry data, about 75% of imaging facilities save the RDSR. The RDSR is nearly universally available, but regulation would help make it universally accessible. The TEP agreed that CMS should help advocate for policy change once the measure is adopted.

Accuracy of CT categorization

Dr. Smith-Bindman described the expectations underlying CT category assignment:


1.    Head, chest, and abdomen exams should account for most exams.
2.    Routine dose categories should be more common than high or low dose categories.
3.    Categories assigned based on anatomic area alone (e.g. neck, extremity) should be highly accurate.


UCSF compared the accuracy of CT assignment based on ICD10 and CPT codes against a referent standard, previously shown to be over 90% accurate, which classifies exams based on natural language processing of DICOM metadata. UCSF has also obtained the full CT exam report for every exam, which is being used to improve the accuracy of the referent standard. 


Four categories were unexpectedly rare (<1%) in the testing data: head high dose; cardiac or chest high dose; simultaneous head and neck high dose; and simultaneous thoracic and lumbar spine (slide 19). UCSF continues to monitor these categories, but they are excluded from today’s presentation.


As expected, most exams were in the head, chest, and abdomen categories. Similarly, exams in the routine dose categories exceeded those in the high and low dose categories; for example, abdomen low dose ranged from 1-4%, abdomen routine from 24-36%, and abdomen high from 3-6%. These numbers are similar to those in the UCSF Registry (slide 20). Niall Brennan asked why there was so much variation within categories between sites. Dr. Smith-Bindman responded that it stems from a few factors: sites that do screening exams – e.g. for coronary artery calcification or lung cancer – would have a higher number of low dose exams. Location of the exam also matters; for example, hospitals do far more evaluation for trauma than outpatient centers. Lastly, sites that are major cancer referral centers tend to show a greater number of high dose exams. In general, the numbers reflect clinical characteristics of the underlying population – i.e. the risk pool – rather than decisions made by radiologists.

On the last slide in this section, Dr. Smith-Bindman showed the accuracy of CT categorization against the referent standard: over 90% in all but one site (slide 21). The UCSF team is working iteratively to improve both the referent standard and claims-based calculation algorithm, for example, to account for new codes added in 2021.

10:30 AM     Discussion: overview, exclusions, missing data & CT category (Dr. Sandy) 

Dr. Sandy introduced the discussion questions: 


1.    Do you have suggestions on how to reduce the rate of missing data?
2.    Are you convinced that the approach for assigning CTs to the CT categories is sufficiently accurate for use in the measure?


Dr. Jeph Herrin asked for more information on the other types of missing data listed but not discussed: missing patient size and missing noise. Dr. Smith-Bindman explained that these data were not missing per se; rather, the UCSF team was unable to measure them from the images received. Size is calculated by detecting the edge of the patient, where tissue meets air, based on image density in Hounsfield units (HU), and UCSF is currently exploring discrepancies in HUs across sites. This work is ongoing, and it is expected that this source of missingness will be reduced or eliminated.
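
A minimal sketch of the edge-detection idea, assuming a simple Hounsfield-unit cutoff and a fixed pixel spacing (both illustrative; the exact UCSF algorithm differs in detail):

```python
import numpy as np

# Sketch of size estimation by finding the tissue/air boundary in Hounsfield
# units (HU). The cutoff and pixel spacing are illustrative assumptions.

AIR_HU_CUTOFF = -500        # air is ~ -1000 HU; soft tissue is near 0 HU
PIXEL_SPACING_MM = 0.8      # would come from the DICOM PixelSpacing tag

def estimate_diameter_cm(slice_hu: np.ndarray) -> float:
    """Estimate lateral patient diameter on one image slice."""
    body_mask = slice_hu > AIR_HU_CUTOFF          # pixels denser than air
    cols_with_tissue = np.any(body_mask, axis=0)  # columns containing patient
    width_px = np.count_nonzero(cols_with_tissue)
    return width_px * PIXEL_SPACING_MM / 10.0     # mm -> cm

# Synthetic example: 512x512 slice of air with a 300-pixel-wide "patient"
slice_hu = np.full((512, 512), -1000.0)
slice_hu[100:400, 106:406] = 40.0                 # soft-tissue block
print(estimate_diameter_cm(slice_hu))             # 24.0 cm at 0.8 mm pixels
```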


Dr. Sandy remarked that the concordance is “a little low, even among the referent standard.” He asked, “What difference does it make to the measure if there’s inaccuracy in the classification [of exams]?” Dr. Smith-Bindman explained that the dose thresholds are actually similar across adjacent categories (e.g. low vs. routine, routine vs. high); thus, the classification is important but not crucial. She stated the claims-based algorithm seems to be more accurate than the referent standard, and the accuracy numbers will improve as UCSF refines both methods. She explained, moreover, that the referent standard was based on data from UCSF Health alone, and that the team is now revising it using highly descriptive data from testing sites as well as asking sites to confirm individual exams to ensure accuracy.

 
Dr. Sandy asked how easily systems can be reprogrammed to save RDSRs. Dr. Tony Seibert shared his experience at UC Davis: despite having a sophisticated radiology IT team, they were unknowingly not saving the RDSR simply because they had never needed to. They have mostly GE machines, which had to be changed manually, one protocol at a time, and he said it is not unusual to have 300 or more protocols. Encouragingly, he noted that the ACR Dose Index Registry is requiring RDSRs, which will be the type of impetus needed for universal saving.


Dr. Ken Wang added a note of wariness about physician and hospital inertia: sites that are already tracking dose may find saving the RDSR useful, but sites without monitoring systems may lack motivation.


Dr. Andy Bindman suggested it may be relatively easy for CMS to change the regulation. CMS’s current policy (requiring imaging providers meet the NEMA standard) makes it clear that they value the RDSR as an important element of dose data, and a small change in the language could ensure the report is saved. 

10:40 AM    Patient size and risk-adjusted dose thresholds (Dr. Smith-Bindman)

Dr. Smith-Bindman explained the goal behind the radiation dose threshold: as low as possible to support safety but not so low as to compromise image quality. She reiterated what was presented at previous TEP meetings: that dose thresholds were developed based on an Image Quality Study, in which doses were set per category at the level at which at least 90% of physicians rated images as acceptable (“excellent” or “adequate”). If at least 90% of physicians found quality acceptable at every dose level, the threshold was set at the median dose for that category from the UCSF Registry. Patient age and sex and machine make and model do not contribute a great deal to dose variation; thus, they are not adjusted for in the measure. Research shows optimized doses can be achieved on any machine.
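
A minimal sketch of this threshold-setting rule, assuming per-dose-level acceptability fractions from the Image Quality Study and a list of registry doses as inputs (a simplification of the actual methodology):

```python
import statistics

# Sketch of the dose-threshold rule described above. Inputs are assumed:
# `study` maps each tested dose level to the fraction of physicians who rated
# images at that level acceptable ("excellent" or "adequate");
# `registry_doses` are doses for the category from the UCSF Registry.

def dose_threshold(study: dict, registry_doses: list) -> float:
    acceptable = [dose for dose, frac in study.items() if frac >= 0.90]
    if len(acceptable) == len(study):
        # Quality was acceptable at every observed dose level:
        # fall back to the registry median for the category.
        return statistics.median(registry_doses)
    if not acceptable:
        raise ValueError("no dose level met the 90% acceptability bar")
    # Otherwise, the lowest dose at which >= 90% of physicians
    # rated the images acceptable.
    return min(acceptable)
```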

The measure is risk-adjusted for patient size, which contributes significantly to radiation dose, as larger patients need higher doses. Dr. Smith-Bindman showed a table of the distribution of patients by size decile across sites, illustrating that some medical groups serve a greater proportion of larger patients than others (slide 25). On slide 26, she showed that UCSF’s size calculation resulted in similar measurements between categories (for example, similar diameter measurements in low, routine, and high dose abdomen exams), as expected. The extremity category shows a bimodal distribution, which reflects measured upper and lower extremities.

Unadjusted, size would be the biggest driver of out-of-range scores on the measure, as illustrated in the table on slide 27, where 86% of exams in the highest size decile were out-of-range. When size is adjusted for (size is measured on the mid-scan axial or coronal scout images), out-of-range rates are similar across all size deciles (slide 28). This ensures the measure does not penalize providers caring for larger patients.

On slide 29, she showed the radiation dose distribution for each medical group plotted in line graphs, with red lines indicating dose thresholds. This showed clearly that the thresholds for some sub-categories were close – such as low and routine dose abdomen – meaning errors in CT classification would have minor impact on out-of-range scores based on dose. 

Lastly on slide 30, she presented out-of-range rates by medical group. Overall rates ranged from 20-47%, illustrating considerable variability. 


10:50 AM    Discussion: Patient size and dose thresholds (Dr. Sandy)

Discussion questions:


1.    Does the approach for identifying CTs with out-of-range radiation doses capture exams with poor quality based on using too high a dose?
2.    Does the TEP endorse the risk-adjustment approach based on patient size?


Dr. Len Lichtenfeld commented that some of the out-of-range rates seem exceptionally high (e.g. 47%). Dr. Smith-Bindman reminded the group of how the thresholds were set: for categories without an observed dose threshold in the Image Quality Study (meaning all observed doses were acceptable based on radiologist scoring), the median from the UCSF Registry was used. The TEP members had multiple opportunities to weigh in on this decision and consistently supported using the median. For those categories, UCSF expected out-of-range rates near 50% from the outset. UCSF also expects that current radiation doses are much higher than needed, and this is indeed what the data show. Dr. Lichtenfeld reiterated that the implications of such rates are significant.


Dr. Arjun Venkatesh wanted to know which categories are demarcated by the 90% satisfaction thresholds and which use the median. Dr. Smith-Bindman did not have the answer on hand but stated that both types of thresholds drive performance. Dr. Venkatesh pointed out that some categories, like chest routine dose, have very high out-of-range rates (54%, 71%, 100%), implying the dose threshold may be set too low. Dr. Smith-Bindman returned to the strongly established observation that doses are higher than needed, often by orders of magnitude. Missy Danforth concurred, stating the data presented today align with outcomes on the pediatric CT quality measure endorsed by the National Quality Forum (NQF) and clearly highlight the performance gap: “doses are generally high.” She commented that, while she has no clinical expertise, she found some of the rates in the 20% range to be rather low.


Dr. Smith-Bindman returned to the dose distribution graphs on slide 29, which show long tails for categories using both the median and 90% satisfaction thresholds. These extreme outliers contribute a lot to out-of-range CT scans, but not to diagnostic accuracy. Dr. Sandy and then Dr. Bindman commented on the large, evident opportunity for dose reduction.


11:00 AM    Beta testing: image quality thresholds (Dr. Smith-Bindman)


Dr. Smith-Bindman reminded the TEP that the rationale for including image quality in the balancing measure was not to maximize image quality, but to protect against the untoward effect of incentivizing lower radiation dose. 


To evaluate image quality in an automated fashion, the UCSF team selected global noise as a measure of image quality. In general, a higher noise value correlates with worse image quality, and the literature shows higher noise is associated with missed diagnoses and lower physician satisfaction. For this measure, the global noise threshold was set at the noise level from the Image Quality Study at which at least 25% of physicians graded images as unacceptable (“poor” or “marginally acceptable”). Even in the Image Quality Study, this was a rare event: of the 25,000 interpretations, only 3% of images were rated poor and 8% marginally acceptable. The event was so rare, in fact, that it was impossible to set a higher threshold (e.g. 50% of physicians dissatisfied). In categories where fewer than 25% of physicians were dissatisfied at any noise level (i.e. categories with no observed threshold), numbers from the literature were used to set thresholds.


On slide 35, Dr. Smith-Bindman illustrated the global noise distributions by category in line graphs, with thresholds marked in red. Virtually all exams are far below threshold, suggesting radiation dose can be lowered, and noise increased, without impacting image quality. On slide 36, she showed out-of-range rates based on global noise; overall they range from 0-0.05%, suggesting image quality is sufficient across all testing sites. On slide 37, she presented overall out-of-range rates for medical groups based on both dose and noise (range = 20-48%). The numbers closely resemble the out-of-range rates for dose, as noise contributes little to measure failure. 


11:10 AM    Discussion: image quality thresholds (Dr. Sandy)


Discussion questions: 


1.    Is the approach for measuring image quality appropriate?
2.    Are the thresholds for measuring image quality adequate? 
3.    Do you agree with our approach for setting noise thresholds using the literature where the image quality study did not have a threshold (i.e. there was no observed threshold with close to 25% of physicians rating exams as poor or marginally acceptable)?
4.    What additional analyses of beta testing data would improve your confidence in our measure?

Dr. Seibert asked if the UCSF team is controlling for slice thickness or for bone vs. soft tissue kernel, noting that noise varies significantly with these factors. Dr. Smith-Bindman confirmed that they are adjusting for slice thickness. She explained that global noise is calculated on every slice and averaged across all slices within a series; in multi-phase exams, the best (i.e. lowest) noise value across all series is taken. The methodology is based on Dr. Ehsan Samei’s work; what UCSF added is the step of taking the best cross-series, exam-level noise (Dr. Samei’s method assessed noise only at the series level).
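
A minimal sketch of the exam-level aggregation just described; the per-slice noise estimate here is a simple stand-in, not Dr. Samei’s published global noise method:

```python
import numpy as np

# Sketch of the exam-level aggregation described above. The per-slice noise
# estimate (std. dev. of soft-tissue HU values) is a stand-in for the actual
# global noise calculation.

def slice_noise(slice_hu: np.ndarray) -> float:
    """Stand-in per-slice noise estimate over soft-tissue pixels."""
    soft_tissue = slice_hu[(slice_hu > -300) & (slice_hu < 300)]
    return float(np.std(soft_tissue)) if soft_tissue.size else float("nan")

def exam_global_noise(series_list: list) -> float:
    """series_list: one list of 2D HU slice arrays per reconstructed series."""
    # Average noise across slices within each series...
    per_series = [np.nanmean([slice_noise(s) for s in series])
                  for series in series_list]
    # ...then take the best (lowest) series value for the exam.
    return float(np.nanmin(per_series))
```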

Dr. Krishna Nallamshetty supported the 25% threshold, suggesting: “If I'm looking at an image in the reading room with my colleagues, if more than 25% said it was too noisy, then I think that's probably a reasonable threshold to adjust the technique.”

11:25 AM    Physician Burden Assessment (Dr. Smith-Bindman)

UCSF assessed burden through interviews with seven of the eight testing sites; the interview with the last site will occur after testing there is complete. Participants included the site PIs (a radiologist or medical physicist) and everyone else involved in testing, including PACS administrators, IT personnel, and data analysts. Aside from the effort of the physicians acting as site PIs, all effort was performed by staff, not physicians. The greatest effort was in the initial setup, not ongoing work; thus, if testing were repeated, the time commitment would be lower in subsequent rounds. On slide 39, she presented the average number and range of hours involved in each step of testing:

Step | Range (hours) | Average (hours)
Server/software set up | 3-13 | 7.2
Migration of imaging exams to server | 1-20 | 4.0
Sending EHR extract to software | 3-25 | 11.3
Including RDSRs in PACS | 1-50 | 25.3
Total |  | 47.9

The average cost per hour of the personnel working on the project was $50. Therefore, the initial round of testing was completed at a cost of roughly $2,400 per site (47.9 hours × $50/hour).

11:30 AM    Discussion: Physician Burden Assessment (Dr. Sandy)

Discussion questions:


1.    Is the current burden comparable to other measures?
2.    Is this level of burden acceptable?


Ms. Danforth asked Dr. Smith-Bindman to explain how software distribution would work in a national implementation: would CMS manage the distribution? Dr. Smith-Bindman explained that her research team lacked the capacity to develop professional-grade reporting software at scale. Thus, she and UCSF colleagues founded a company (Alara Imaging, Inc.) that is developing new software aligned with the eCQM’s intent: pulling data automatically, with minimal human touch, through FHIR-based connections with the various data sources (EHR, PACS, RIS, etc.), and reporting aggregated scores to CMS.


Dr. Helen Burstin, Dr. Debra Ritzwoller, and Niall Brennan asked questions in the chat about whether burden was assessed in outpatient/ambulatory settings. Dr. Smith-Bindman confirmed outpatient facilities are represented in the testing data, including an exclusively outpatient practice with 17 imaging facilities (ARA). The remainder of testing sites, all large health systems, include outpatient settings. Burden is not expected to vary by inpatient vs. outpatient. 


Dr. Sandy asked Dr. Smith-Bindman to elaborate on the ongoing work required, beyond initial setup. She responded that one of the PACS administrators interviewed reported he checked on the data transfer twice a day while it was ongoing, but other sites reported less “babysitting.” Perhaps a good comparison is the UCSF Registry, which collects CT imaging data from over 160 hospitals and imaging facilities worldwide, continuously in real-time. The UCSF team monitors incoming volume and contacts sites when glitches occur, but generally speaking, operational burden is minimal.


Both Drs. Sandy and Lichtenfeld commented that CMS, NQF, and the public are interested in staff burden, even for measures with little to no burden on physicians. Dr. Lichtenfeld said the issue of staff burden comes up regularly on the NQF Cancer Committee, on which he sits. He said that, relative to other measures that require staff to do manual chart audits for every patient, this measure seems less burdensome from the start, and the more UCSF can automate reporting, the better. He said UCSF will certainly face community barriers to implementation, so CMS needs to strongly “get behind it.” He gave the example of the Mammography Quality Standards Act: when it was reviewed, CMS asked specifically for implementation costs and ultimately agreed to increase mammogram payments to cover the cost.


Dr. Nallamshetty asked how UCSF plans to approach the scenario in which private practice radiologists read images generated by a hospital but have no authority over the IT or other hospital staff involved in reporting. If a corresponding measure is not adopted for the hospital program, “What’s the motivation for the hospital facility to provide this level of man-hours to help something that’s purely on the physician side?” Dr. Smith-Bindman agreed and stressed the importance of submitting the hospital measure. She mentioned that many hospitals employ personnel responsible for quality reporting for both physicians and facilities. The TEP has not discussed this in depth, but those teams would need to learn how to access the radiology (non-EHR) data sources to implement this measure.


Dr. Mythreyi Chatfield cautioned that the technical complexity of the measure may be the biggest barrier to implementation. The UCSF team understands this will be a key sticking point for reviewers and will address it accordingly. Dr. Sandy emphasized the need to underscore the importance of the measure: “How you frame it … will affect the perception of the burden.” This was echoed by both patient representatives in the chat, whose comments are worth quoting in full:


-    Dr. Debra Ritzwoller: I think this is an incredibly important and necessary quality metric.  But consistent with above, we need to figure out how to “sell” the importance of this measure, and figure out the best way to streamline the implementation of the needed IT/EHR and software issues.
-    Suz Schrandt: I think the patient community (if properly informed and equipped) can be a positive force for overcoming burden. If it is plainly a patient safety issue, and patients can speak up for their own and their family's safety, that seems like a pretty winnable issue.

As a final point on burden, Dr. Burstin and Ms. Danforth both advised that UCSF ensure the measure is tested in the intended population. While UCSF has included outpatient facilities in its testing, it should also try to test the measure in smaller, independent facilities, whose physicians may not belong to a TIN, or may not belong to a “large, more system-oriented” TIN like those tested thus far.

11:40 AM    Face Validity Assessment (Dr. Romano)

Next, Dr. Patrick Romano introduced the vote to formally endorse the face validity of the measure. All TEP members (including two non-voting members) were asked to vote yes or no on a series of five questions. (UCSF can identify respondents’ names on the back end.) Anyone voting no was asked to elaborate on their reason for disagreement in the chat.

Unless otherwise indicated, the results below include later voting by the two panelists absent from the meeting (Drs. Hedvig Hricak and Todd Villines):

Do you agree that radiation dose is a relevant metric of quality for CT imaging?

100% agreement (N=19)

Do you agree that image noise is a relevant metric of quality for CT imaging?

*Dr. Romano clarified this question is not assessing noise as a standalone metric, but as part of a balancing measure.

100% agreement (N=19)

Do you agree that size is an appropriate method for adjusting for radiation dose for a given indication?

100% agreement (N=19)

Do you agree that performance on this measure of radiation dose and image quality, adjusted for size, stratified by indication, is a representation of quality?

100% agreement (N=19)

Do you agree that implementation of this measure is likely to lead to reductions in radiation dose while maintaining adequate image quality?

*Dr. Romano clarified this question assumes the burden issue is managed and the measure is implemented.

Of TEP members present on 4/27/21 (N=17):

Yes = 11/17 (64.7%) incl. 2 non-voting members

No = 4/17 (23.5%)

Abstained = 2/17 (11.8%)

The panelists who voted “no” on the final question offered a few comments suggesting discomfort with the binary choice:

 

-    “I voted No, though it is really ‘I don’t know.’”

-    “I think the answer is maybe.”

-    “I voted No, but would like to vote ‘Possibly.’”


One of the members who abstained requested greater clarity in the question, asking whether it meant implementation in the MIPS program alone, or implementation in both the MIPS and hospital (inpatient and outpatient) reporting programs. Based on all feedback, UCSF re-polled the group by email on 5/3/21, dividing the final question into two and restructuring the responses along a 5-point Likert scale.

How likely is it that implementation of this size-adjusted and stratified measure, as specified by the UC development team, in the Merit-based Incentive Payment System (MIPS), will lead to a reduction in average CT radiation dose while maintaining adequate CT image quality?

Very likely = 6/19 (32%), incl. 1 non-voting

 

Somewhat likely = 12/19 (63%) incl. 1 non-voting

 

Somewhat unlikely = 1/19 (5%)

How likely is it that implementation of this size-adjusted and stratified measure, as specified by the UC development team, in the MIPS and hospital quality reporting programs (inpatient/outpatient), will lead to a reduction in average CT radiation dose while maintaining adequate CT image quality?

Very likely = 11/19 (58%) incl. 1 non-voting

 

Somewhat likely = 7/19 (37%) incl. 1 non-voting

 

Somewhat unlikely = 1/19 (5%)


11:55 AM    Wrap Up and Next Steps (Dr. Smith-Bindman)

Dr. Smith-Bindman closed the meeting by thanking panelists and reminding them there would be one additional TEP meeting prior to the NQF submission later in the summer. 

12:00 PM    Adjourn (Dr. Sandy)